Testing Ensemble Methods on Prediction of Protein Secondary Structure

Authors

  • Jakob V. Hansen
  • Anders Krogh
Abstract

The geometric opinion pool (GOP) ensemble method uses a multiplicative combination of predictors and is tailored to probability estimation in multi-class problems. The multiplicative form enables a decomposition of the Kullback-Leibler entropy error function into an ambiguity term and an average error term. This decomposition can be used to estimate generalization error by combining cross-validation with an estimate of the ambiguity from unlabeled data. The method is tested empirically and found to compare favorably with single artificial neural networks and with ensemble methods using the linear opinion pool (LOP) and mean square error (MSE). The test problem is the prediction of protein secondary structure.
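As a minimal sketch of the decomposition mentioned in the abstract, the Python snippet below assumes the GOP is a normalized, weighted geometric mean of the member distributions, with weights summing to one. The abstract does not give the exact formula, so this form, the function names, and the toy probabilities are illustrative assumptions, not the authors' implementation. Under that assumption, the ensemble's Kullback-Leibler error equals the weighted average member error minus an ambiguity term, and the ambiguity depends only on the predictions, which is why it could be estimated from unlabeled data.

```python
import numpy as np


def geometric_pool(probs, weights):
    """Normalized weighted geometric mean of member distributions.

    probs:   array of shape (n_members, n_classes), strictly positive rows
    weights: array of shape (n_members,), assumed to sum to 1
    """
    log_pool = weights @ np.log(probs)   # weighted sum of log-probabilities
    log_pool -= log_pool.max()           # shift for numerical stability
    pool = np.exp(log_pool)
    return pool / pool.sum()             # renormalize to a distribution


def kl(p, q):
    """Kullback-Leibler divergence KL(p || q), skipping zero entries of p."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


# Toy three-class example (e.g. helix, strand, coil); values are made up.
members = np.array([
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.6, 0.1, 0.3],
])
weights = np.full(3, 1.0 / 3.0)
target = np.array([1.0, 0.0, 0.0])       # one-hot label for this example

ensemble = geometric_pool(members, weights)

# Weighted average of the individual members' errors (needs the label).
avg_error = sum(w * kl(target, p) for w, p in zip(weights, members))

# Ambiguity: weighted divergence of members from the ensemble (label-free).
ambiguity = sum(w * kl(ensemble, p) for w, p in zip(weights, members))

ensemble_error = kl(target, ensemble)

print("ensemble error        :", ensemble_error)
print("avg error - ambiguity :", avg_error - ambiguity)
```

The two printed values agree up to floating-point rounding. Because the ambiguity is non-negative, the ensemble error under this assumed form never exceeds the weighted average error of its members.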


Related articles

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

A DNA sequence, which contains all genetic traits, is not itself a functional entity. Instead, it is converted into protein sequences through transcription and translation. The protein sequence then takes on a 3D structure, which is the functional unit and manages biological interactions using the information encoded in DNA. Every life process one can think of is carried out by proteins with specific functio...


A General Method for Combining Predictors Tested on Protein Secondary Structure Prediction

Ensemble methods, which combine several classifiers, have been successfully applied to decrease generalization error of machine learning methods. For most ensemble methods the ensemble members are combined by weighted summation of the output, called the linear average predictor. The logarithmic opinion pool ensemble method uses a multiplicative combination of the ensemble members, which treats ...


Combining Statistical Models for Protein Secondary Structure Prediction

We investigate the problem of combining experts to predict the secondary structure of globular proteins. We first present two different statistical models for this task. We then analyse an efficient linear combination technique, which sheds light on unexplained phenomena frequently encountered in practice for ensemble methods.


Ensemble of Neural Networks to Solve Class Imbalance Problem of Protein Secondary Structure Prediction

Protein secondary structure prediction (PSSP) is considered a challenging task in bioinformatics. Many approaches have been proposed in the last few decades to solve this problem. Despite the improvements achieved, prediction accuracy remains limited. Accurate prediction of the secondary structure of proteins is a critical step in deducing the tertiary structure of proteins and t...


Combining protein secondary structure prediction models with ensemble methods of optimal complexity

Many sophisticated methods are currently available to perform protein secondary structure prediction. Since they are frequently based on different principles, and different knowledge sources, significant benefits can be expected from combining them. However, the choice of an appropriate combiner appears to be an issue in its own right. The first difficulty to overcome when combining prediction methods...




Publication date: 2007